
ProGMap: an integrated annotation resource for protein orthology

Author: He Ying
Kuzniar Arnold
Leunissen Jack A. M.
Lin Ke
Nijveen Harm
Pongor Sándor
Publication venue: Oxford University Press
Publication date: 01/01/2009

Current protein sequence databases employ different classification schemes that often provide conflicting annotations, especially for poorly characterized proteins. ProGMap (Protein Group Mappings, http://www.bioinformatics.nl/progmap) is a web tool designed to help researchers and database annotators assess the coherence of protein groups defined in various databases and thereby facilitate the annotation of newly sequenced proteins. ProGMap is based on a non-redundant dataset of over 6.6 million protein sequences, which is mapped to 240 000 protein group descriptions collected from UniProt, RefSeq, Ensembl, COG, KOG, OrthoMCL-DB, HomoloGene, TRIBES and PIRSF. ProGMap combines the underlying classification schemes via a network of links constructed by a fast and fully automated mapping approach originally developed for document classification. The web interface enables queries to be made using sequence identifiers, gene symbols, protein functions or amino acid and nucleotide sequences. For the latter query type, BLAST similarity search and QuickMatch identity search services have been incorporated for finding sequences similar (or identical) to a query sequence. ProGMap is meant to help users of high-throughput methodologies who deal with partially annotated genomic data.

Electronic Sumy State University Institutional Repository

Radar and eddy-current method for metal detection

Author: Anand K Gavai
Farahaniza Supandi
Hannes Hettling
Jack A M Leunissen
Johannes H G M van Beek
Paul Murrell
Publication venue: Sumy State University
Publication date: 01/01/2014

A modern ground-penetrating radar (GPR) is a complex geophysical instrument for non-destructive testing of inhomogeneities in a medium. GPR operation (subsurface sounding) is based on the phenomenon of reflection of an electromagnetic wave from the boundary between layers with different dielectric or magnetic permeability. Such boundaries are local inhomogeneities of various kinds. GPRs detect such an inhomogeneity and its depth with high probability, but they cannot determine its composition, for example whether it is steel or gold. Hence the need arose to create a GPR without this shortcoming.

VU Research Portal

FigShare

Ten Simple Rules for Developing a Short Bioinformatics Training Course

Author: Allegra Via
Anna Tramontano
David Landsman
Jack A. M. Leunissen
Javier De Las Rivas
Maria Victoria Schneider
Michelle D. Brazas
MV Schneider
Teresa K. Attwood
Publication venue: Public Library of Science
Publication date: 01/01/2011

This is an open-access article under a Creative Commons license. This paper considers what makes a short course in bioinformatics successful. In today’s research environment, exposure to bioinformatics training is something that anyone embarking on life sciences research is likely to need at some point. Furthermore, as research technologies evolve, this need will continue to grow. In fact, as a consequence of the introduction of high-throughput technologies, there has already been an increase in demand for training relating to the use of computational resources and tools designed for high-throughput data storage, retrieval, and analysis. Biologists and computational scientists alike are seeking postgraduate learning opportunities in various bioinformatics topics that meet the needs and time restrictions of their schedules. Short, intensive bioinformatics courses (typically from a couple of days to a week in length, and covering a variety of topics) are available throughout the world, and more continue to be developed to meet the growing training needs. This work was partly supported by the Intramural Research Program of the NIH, NLM, NCBI, and by funds awarded to the EMBL-European Bioinformatics Institute by the European Commission under SLING, grant agreement number 226073 (Integrating Activity) within Research Infrastructures of the FP7 Capacities Specific Programme EMBL-EBI. Peer reviewed.

The University of Manchester - Institutional Repository

Digital.CSIC

Archivio della ricerca- Università di Roma La Sapienza

University of Melbourne Institutional Repository

Constraint-based probabilistic learning of metabolic pathways from tomato volatiles

Author: Anand K. Gavai
Arnaud Bovy
C Meek
E Baldwin
E Yilmaz
Fred van Eeuwijk
G Suizdak
Harm Nijveen
I Tsamardinos
J Kopka
Jack A. M. Leunissen
K Morgenthal
M Kalisch
M Zou
MI Jordan
MJ Beal
N Friedman
N Schauer
Peter J. F. Lucas
R Gohlke
R Opgen-Rhein
R Ursem
Remco Ursem
S Moco
W Weckwerth
Y Tikunov
Yury Tikunov
Publication venue: Springer US
Publication date: 01/01/2009

Clustering and correlation analysis techniques have become popular tools for the analysis of data produced by metabolomics experiments. The results obtained from these approaches provide an overview of the interactions between objects of interest. Often in these experiments, one is more interested in information about the nature of these relationships, e.g., cause-effect relationships, than in the actual strength of the interactions. Finding such relationships is of crucial importance, as most biological processes can only be understood in this way. Bayesian networks allow representation of these cause-effect relationships among variables of interest in terms of whether and how they influence each other given that a third, possibly empty, group of variables is known. This technique also allows the incorporation of prior knowledge as established from the literature or from biologists. The representation of these relationships as a directed graph is highly intuitive and helps to understand these processes. This paper describes how constraint-based Bayesian networks can be applied to metabolomics data and can be used to uncover the important pathways which play a significant role in the ripening of fresh tomatoes. We also show here how this method of reconstructing pathways is intuitive and performs better than classical techniques. Methods for learning Bayesian network models are powerful tools for the analysis of data of the magnitude generated by metabolomics experiments. They allow one to model cause-effect relationships and help in understanding the underlying processes.
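The core step of constraint-based structure learning is a conditional independence test: two variables are left unconnected when they become independent once a third is known. The abstract does not say which test the authors used, so the following is only a minimal sketch under the assumption of roughly Gaussian data, using a first-order partial correlation with a Fisher z-test. The function names and the `alpha` threshold are illustrative, not taken from the paper.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def independent_given(x, y, z, alpha=0.05):
    """Fisher z-test: True if x and y look independent given z."""
    r = partial_corr(x, y, z)
    # Clamp to avoid log(0) when the partial correlation is numerically +/-1.
    r = max(min(r, 0.999999), -0.999999)
    n = len(x)
    # One conditioning variable, so the scaling is sqrt(n - 1 - 3).
    stat = math.sqrt(n - 1 - 3) * abs(0.5 * math.log((1 + r) / (1 - r)))
    p_value = 2 * (1 - NormalDist().cdf(stat))
    return p_value > alpha
```

In a full constraint-based learner this test would be repeated over pairs of metabolites and growing conditioning sets, removing an edge whenever independence is accepted; the sketch covers only the single-variable case.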

Public Library of Science (PLOS)

Radboud Repository

Gene Expression in Chicken Reveals Correlation with Structural Genomic Features and Conserved Patterns of Transcription in the Terrestrial Vertebrates

Author: Aart Lammers
AE Vinogradov
AJ Hulbert
AM Boutanaev
BY Liao
CI Castillo-Davis
Darren P. Martin
DK Kim
E Eisenberg
ET Chan
Evert M. van Schothorst
GK Smyth
H Caron
H Nie
Haisheng Nie
Hendrik-Jan Megens
Jaap Keijer
Jack A. M. Leunissen
M Kimura
Martien A. M. Groenen
P Khaitovich
PB Neerincx
Pieter B. T. Neerincx
RC Gentleman
Richard P. M. A. Crooijmans
RW Morgan
S Durinck
S Falcon
S van Hemert
T Mijalski
W Huber
W Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background - The chicken is an important agricultural and avian-model species. A survey of gene expression in a range of different tissues will provide a benchmark for understanding expression levels under normal physiological conditions in birds. With expression data for birds being very scant, this benchmark is of particular interest for comparative expression analysis among various terrestrial vertebrates. Methodology/Principal Findings - We carried out a gene expression survey in eight major chicken tissues using whole genome microarrays. A global picture of gene expression is presented for the eight tissues, and tissue-specific as well as common gene expression was identified. A Gene Ontology (GO) term enrichment analysis showed that tissue-specific genes are enriched with GO terms reflecting the physiological functions of the specific tissue, and housekeeping genes are enriched with GO terms related to essential biological functions. Comparisons of structural genomic features between tissue-specific genes and housekeeping genes show that housekeeping genes are more compact. Specifically, their coding sequences and particularly their introns are shorter than those of genes that display more variation in expression between tissues, and their intergenic space is also shorter. Meanwhile, housekeeping genes are more likely to co-localize with other abundantly or highly expressed genes in the same chromosomal regions. Furthermore, comparisons of gene expression in a panel of five common tissues between birds, mammals and amphibians showed that the expression patterns across tissues are highly similar for orthologous genes compared to random gene pairs within each pair-wise comparison, indicating a high degree of functional conservation in gene expression among terrestrial vertebrates.
Conclusions - The housekeeping genes identified in this study have shorter gene length, shorter coding sequence length, shorter introns, and shorter intergenic regions; there seems to be selection pressure on economy in genes with a wide tissue distribution, i.e. these genes are more compact. A comparative analysis showed that the expression patterns of orthologous genes are conserved in the terrestrial vertebrates during evolution.
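GO term enrichment of a gene set is commonly assessed with a one-sided hypergeometric test. The abstract does not state which statistic or software the authors used, so the following is only an illustrative sketch of that standard test; the function and parameter names are assumptions.

```python
from math import comb


def enrichment_p(population, annotated, drawn, hits):
    """P(X >= hits) for a hypergeometric draw.

    population: number of genes on the array
    annotated:  genes in the population carrying the GO term
    drawn:      size of the gene set under study (e.g. tissue-specific genes)
    hits:       genes in that set carrying the GO term
    """
    upper = min(annotated, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(population, drawn)
```

A small p-value suggests the term occurs in the gene set more often than expected by chance; when many terms are tested, a multiple-testing correction would normally be applied on top.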

Edinburgh Research Explorer

A pipeline for high throughput detection and mapping of SNPs from EST databases

Author: A Ching
A Rafalski
A-C Syvanen
A. M. Anithakumari
AK Masouleh
BCY Collard
Ben Vosman
C Schlotterer
C. Gerard van der Linden
D Milbourne
DJ Somers
DL Hyten
DL Wheeler
E Jacobsen
G Ablett
G Barker
G Jander
GT Bryan
GT Marth
Herman J. van Eck
HV van Os
IY Choi
J Tang
Jack A. M. Leunissen
JBOA Fan
Jifeng Tang
JS Werij
JW Ooijen Van
KL McNally
N Rostoks
P Vos
PS Hanneman RE
R Sachidanandam
R Shen
RA Hoskins
RE Voorrips
Richard G. F. Visser
S Feingold
SF Altschul
TJ Vision
Y-J Shen
YL Zhu
Publication venue: Springer Netherlands
Publication date: 01/01/2010

Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic variation that can be used as molecular markers. The SNPs that are hidden in sequence databases can be unlocked using bioinformatic tools. For efficient application of these SNPs, the sequence set should be as error-free as possible, target single loci and be suitable for the SNP scoring platform of choice. We have developed a pipeline to effectively mine SNPs from public EST databases with or without quality information using QualitySNP software, select reliable SNPs and prepare the loci for analysis on the Illumina GoldenGate genotyping platform. The applicability of the pipeline was demonstrated using publicly available potato EST data, genotyping individuals from two diploid mapping populations and subsequently mapping the SNP markers (putative genes) in both populations. Over 7000 reliable SNPs were identified that met the criteria for genotyping on the GoldenGate platform. Of the 384 SNPs on the SNP array, approximately 12% dropped out. For the two potato mapping populations, 165 and 185 segregating SNP loci could be mapped on the respective genetic maps, illustrating the effectiveness of our pipeline for SNP selection and validation.

HSPVdb—the Human Short Peptide Variation Database for improved mass spectrometry-based detection of polymorphic HLA-ligands

Author: AI Nesvizhskii
Arnoud H. de Ru
Aurélie Viars
CA Bergen van
CC Oliveira
Chopie Hassan
D Stepniak
DN Perkins
E Spierings
F Reisinger
Harm Nijveen
HD Meiring
HS Hiemstra
J. H. Fred Falkenburg
Jack A. M. Leunissen
JH Falkenburg
JH Kessler
JK Eng
KD Pruitt
L Hambach
LC Eisenlohr
M Bleakley
Machiel de Jager
Michel G. D. Kester
N Hillen
N Salimi
NJ Edwards
O Ho
Peter A. van Veelen
PJ Kersey
R Storb
S Schandorff
ST Sherry
T Etzold
The UniProt Consortium
VH Engelhard
WA Marijt
Publication venue: Springer-Verlag
Publication date: 01/01/2010

T cell epitopes derived from polymorphic proteins or from proteins encoded by alternative reading frames (ARFs) play an important role in (tumor) immunology. Identification of these peptides is successfully performed with mass spectrometry. In a mass spectrometry-based approach, the recorded tandem mass spectra are matched against hypothetical spectra generated from known protein sequence databases. Commonly used protein databases contain a minimal level of redundancy and are thus not suitable data sources for searching polymorphic T cell epitopes, either in normal reading frames or in ARFs. At the same time, however, these databases contain much non-polymorphic sequence information, thereby complicating the matching of recorded and theoretical spectra and increasing the potential for finding false positives. Therefore, we created a database with peptides from ARFs and peptide variation arising from single nucleotide polymorphisms (SNPs). It is based on the human mRNA sequences from the well-annotated reference sequence (RefSeq) database and associated variation information derived from the Single Nucleotide Polymorphism Database (dbSNP). In this process, we removed all non-polymorphic information. Investigation of the frequency of SNPs in dbSNP revealed that many SNPs are non-polymorphic “SNPs”. Therefore, we removed those from our dedicated database, and this resulted in a comprehensive, high-quality database, which we named the Human Short Peptide Variation Database (HSPVdb). The value of our HSPVdb is shown by identification of the majority of published polymorphic SNP- and/or ARF-derived epitopes from a mass spectrometry-based proteomics workflow, and by a large variety of polymorphic peptides identified as potential T cell epitopes in the HLA-ligandome presented by the Epstein–Barr virus cells.
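Keeping only polymorphic sequence information can be pictured as storing just those short peptides that overlap a variant position. The abstract does not describe how HSPVdb was actually built, so the sketch below is a hypothetical illustration: the function name, the 0-based position convention and the default peptide lengths (8 to 11 residues) are all assumptions.

```python
def variant_peptides(protein, pos, alt, min_len=8, max_len=11):
    """Return every peptide of length min_len..max_len that covers the
    substituted residue. `pos` is a 0-based index into `protein` and
    `alt` is the single amino acid introduced by the SNP."""
    if not 0 <= pos < len(protein):
        raise ValueError("variant position lies outside the protein")
    mutated = protein[:pos] + alt + protein[pos + 1:]
    peptides = set()
    for length in range(min_len, max_len + 1):
        first = max(0, pos - length + 1)          # leftmost window covering pos
        last = min(pos, len(mutated) - length)    # rightmost window covering pos
        for start in range(first, last + 1):
            peptides.add(mutated[start:start + length])
    return sorted(peptides)
```

Windows that do not contain the variant are never generated, which mirrors the idea of discarding non-polymorphic sequence so that fewer theoretical spectra compete during matching.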

Leiden University Scholarly Publications

Methods for interpreting lists of affected genes obtained in a DNA microarray experiment

Author: A Alexa
A Bonnet
A Jiménez-Marín
A Skarman
Agnès Bonnet
Arun Kommadath
Axel Skarman
B Zhang
Bart Buitenhuis
Christèle Robert-Granié
Cristina Arce
D Prickett
Dennis Prickett
DJ de Koning
dW Huang
F Jaffrezic
Francesco Ferrari
GK Smyth
Gwenola Tosser-Klopp
H Nie
Haisheng Nie
Henrik Hornshøj
I Hulsegge
Ina Hulsegge
Jack AM Leunissen
Jakob Hedegaard
Jan van der Poel
JJ Goeman
Johanna MJ Rebel
Juan J Garrido
KD Dahlquist
KH Pan
Laurence Liaubet
Lene N Conley
Li Jiang
M Ashburner
M Kanehisa
M Watson
Magali SanCristobal
Mari A Smits
Martien AM Groenen
María Ramirez-Boo
Melania Collado-Romero
Michael Watson
N Salomonis
P Casel
P Sorensen
PBT Neerincx
Peter Sørensen
Pieter BT Neerincx
Q Liu
Q Zheng
S Falcon
S Song
Sandrine Lagarrigue
Silvio Bicciato
SW Doniger
Ángeles Jiménez-Marín
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

BACKGROUND: The aim of this paper was to describe and compare the methods used and the results obtained by the participants in a joint EADGENE (European Animal Disease Genomic Network of Excellence) and SABRE (Cutting Edge Genomics for Sustainable Animal Breeding) workshop focusing on post analysis of microarray data. The participating groups were provided with identical lists of microarray probes, including test statistics for three different contrasts, and the normalised log-ratios for each array, to be used as the starting point for interpreting the affected probes. The data originated from a microarray experiment conducted to study the host reactions in broilers occurring shortly after a secondary challenge with either a homologous or heterologous species of Eimeria. RESULTS: Several conceptually different analytical approaches, using both commercial and publicly available software, were applied by the participating groups. The following tools were used: Ingenuity Pathway Analysis, MAPPFinder, LIMMA, GOstats, GOEAST, GOTM, Globaltest, TopGO, ArrayUnlock, Pathway Studio, GIST and AnnotationDbi. The main focus of the approaches was to utilise the relation between probes/genes and their gene ontology and pathways to interpret the affected probes/genes. The lack of a well-annotated chicken genome did, however, limit the possibilities to fully explore the tools. The main results from these analyses showed that the biological interpretation is highly dependent on the statistical method used, but that some common biological conclusions could be reached. CONCLUSION: It is highly recommended to test different analytical methods on the same data set and compare the results to obtain a reliable biological interpretation of the affected genes in a DNA microarray experiment.

Repositorio Institucional de la Universidad de Córdoba